- *******************************************************************************
-
- CONST.ACC
- Version 1
- November 6, 1985
- by Randy Forgaard
- CompuServe 70307,521
-
- This file presents some hints for choosing values for the special constants
- required by the Turbo Access portion of the Turbo Pascal implementation of the
- Turbo Database Toolbox (formerly the Turbo Toolbox), versions 1.0 and 1.1. It
- applies to all operating systems and computers for which the Database Toolbox
- is available. There are no hard facts in this file that are not also in the
- Toolbox manual, but the hints below may help if your program is going haywire
- and you suspect that the values of the Turbo Access constants may be the source
- of the problem. These hints might also help increase the speed of Turbo Access
- as used by your program.
-
- *******************************************************************************
-
-
- The Turbo Access portion of the Turbo Database Toolbox asks the programmer to
- declare 6 integer constants in the Turbo Pascal program, namely MaxDataRecSize,
- MaxKeyLen, PageSize, Order, PageStackSize, and MaxHeight, prior to the {$I}
- directives that bring in the Toolbox source files. Bad values for these
- constants can result in anything from poor performance to mysterious program
- crashes. Below, these constants are discussed individually.
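-
- As a minimal sketch of where these declarations sit, the skeleton below
- declares all six constants ahead of the {$I} directives. The values are
- illustrative only (they are not recommendations), and the include file names
- other than ACCESS.BOX are assumed here to be GETKEY.BOX, ADDKEY.BOX, and
- DELKEY.BOX; check the names on your own Toolbox disk. The skeleton compiles
- only with those Toolbox files present.
-
```pascal
program ConstSkeleton;

const
  { Illustrative values only; see the section on each constant. }
  MaxDataRecSize = 64;    { bytes in the largest data record }
  MaxKeyLen      = 25;    { characters in the longest key }
  PageSize       = 80;    { even, from 4 to 254 }
  Order          = 40;    { always PageSize div 2 }
  PageStackSize  = 10;    { at least 3; MaxHeight or more is advisable }
  MaxHeight      = 4;

{ The Toolbox source must come after the constants it depends on. }
{$I ACCESS.BOX}
{$I GETKEY.BOX}   { assumed file name }
{$I ADDKEY.BOX}   { assumed file name }
{$I DELKEY.BOX}   { assumed file name }

begin
  { your program goes here }
end.
```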
-
-
- MaxDataRecSize
- --------------
-
- MaxDataRecSize is the size of the largest record you will be storing in any
- DataFile. That is, if you are going to have two kinds of DataFiles, one to
- store records of type R1, and one to store records of type R2, and R2 records
- occupy more storage than R1 records, MaxDataRecSize should be set to the number
- of bytes occupied by records of type R2. If MaxDataRecSize is larger than
- necessary, Turbo Access will still work properly, but some memory will be
- wasted.
-
- Suppose that your program is to have two kinds of DataFiles, some of which hold
- records of type Person, and some of which hold records of type Loan, where
- these record types are defined as follows:
-
-    type
-      Person = record
-        name: String[30];
-        age: Byte;
-        married: Boolean
-      end;
-
-    type
-      Loan = record
-        company: String[40];
-        secured: Boolean;
-        months: Integer;
-        interestRate, payment: Real
-      end;
-
- MaxDataRecSize must be set to the size of Person or Loan, whichever is larger.
- It is dangerous to compute MaxDataRecSize by hand. For example, in computing
- the size of Person, one might say that Person occupies 30 + 1 + 1 = 32 bytes.
- But the actual answer is 33 bytes (the String[30] type occupies 31 bytes, due
- to the additional byte for the length). In computing the size of Loan, one
- might forget that there are actually two Reals, even though they are listed on
- one line. Furthermore, the number of bytes occupied by a Real could be 6, 8,
- or 10 bytes, depending on whether regular Turbo, Turbo-87, or Turbo BCD is
- being used.
-
- To be sure to pick an appropriate value for MaxDataRecSize, create a little
- Turbo program that includes the type definitions for the records you will be
- storing in DataFiles, and write out the size of each of those records using
- Turbo's built-in SizeOf function, so that you can choose the maximum of those
- values. For the above record types, one could write a program like this:
-
-    program DisplaySizes;
-
-    <...type definitions of Person and Loan go here...>
-
-    begin
-      writeln('Size of Person = ', SizeOf(Person));
-      writeln('Size of Loan = ', SizeOf(Loan))
-    end.
-
- Compile and run this small program under the same version of Turbo that you
- will be using for your real program. Running this program under regular Turbo
- gives us 33 bytes for the size of Person, and 56 bytes for the size of Loan.
- Under Turbo-87 we get 33 and 60 bytes, respectively, and Turbo BCD yields 33
- and 64. Assuming that we are using regular Turbo, we would set the value of
- MaxDataRecSize to 56 (the larger of 33 and 56). To be safe, in case we decide
- to use Turbo BCD in the future and forget to change MaxDataRecSize accordingly,
- we might set MaxDataRecSize to 64.
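-
- The figures above can be cross-checked by hand, field by field. The sketch
- below is only such a cross-check, not a substitute for SizeOf. It assumes that
- Byte and Boolean occupy one byte each, that Integer occupies two bytes, that
- String[n] occupies n + 1 bytes, and that fields are stored with no padding
- between them. The function name LoanSize is made up for this illustration.
-
```pascal
program HandCount;

{ Adds up the fields of Loan for a given size of Real, in bytes. }
function LoanSize(RealSize: Integer): Integer;
begin
  LoanSize := (40 + 1)        { company: String[40], plus the length byte }
            + 1               { secured: Boolean }
            + 2               { months: Integer }
            + 2 * RealSize    { interestRate and payment: two Reals }
end;

begin
  writeln('Person           = ', (30 + 1) + 1 + 1);   { 33 }
  writeln('Loan, regular    = ', LoanSize(6));        { 56 }
  writeln('Loan, Turbo-87   = ', LoanSize(8));        { 60 }
  writeln('Loan, Turbo BCD  = ', LoanSize(10))        { 64 }
end.
```
-
- If a hand count like this disagrees with what SizeOf reports, trust SizeOf.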
-
- It is very important that MaxDataRecSize be the correct value, or larger. A
- value for MaxDataRecSize that is too small results in mysterious and erratic
- program behavior, and the cause of the problem can be very difficult to find.
-
-
- MaxKeyLen
- ---------
-
- MaxKeyLen is the length of the longest key you will be using for any of your
- IndexFiles. In Turbo Access, all keys are strings, and MaxKeyLen is a string
- length. For example, if your longest keys are 25 characters long, you would
- use String[25] as the type of any variables that are to hold those keys, and
- you would set MaxKeyLen to 25.
-
- Note that this is a different idea from MaxDataRecSize. The obvious difference
- is that MaxKeyLen refers to the keys in the IndexFiles, and MaxDataRecSize
- refers to the data records in the DataFiles. The subtle difference is that
- MaxKeyLen is the length of the string, _not_ the number of bytes such a string
- would occupy. If your keys are up to 25 characters long, set MaxKeyLen to 25.
- However, if a String[25] is part of a data record, the String[25] would have to
- be counted as 26 bytes for the purposes of computing MaxDataRecSize.
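-
- The distinction is easy to confirm with SizeOf. In the small sketch below, the
- type name KeyStr is made up for the illustration, and MaxKeyLen is declared
- locally only so that the program stands on its own.
-
```pascal
program KeyLenVersusSize;

const
  MaxKeyLen = 25;

type
  KeyStr = String[MaxKeyLen];

begin
  writeln('MaxKeyLen      = ', MaxKeyLen);        { 25 characters }
  writeln('SizeOf(KeyStr) = ', SizeOf(KeyStr))    { 26 bytes, with the length byte }
end.
```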
-
-
- PageSize
- --------
-
- PageSize is the number of key entries in each page of every IndexFile used by
- your program. It must be an even number between 4 and 254, inclusive. Beyond
- this restriction, it is difficult to choose a "correct" value for PageSize; it
- is a performance/space trade-off, and a non-linear one at that. If you choose
- a value for PageSize that is too small, Turbo Access will have to traverse many
- IndexFile pages during a search, which usually means a lot of disk I/O. A
- value for PageSize that is too large will use up a lot of memory without
- yielding a proportionately larger increase in execution speed.
-
- The issue is further complicated in that a single value for PageSize must be
- chosen that will be used for all IndexFiles in your program, even though
- different PageSize values will be optimal for IndexFiles with different maximum
- key lengths. For the purpose of choosing a PageSize value, identify the
- IndexFile whose access speed is most critical to your application. In what
- follows, let K denote the maximum key length of that IndexFile.
-
- Some tests with Turbo Access, on both a hard disk and a floppy, seem to suggest
- the following rule of thumb for achieving good time/space efficiency: Choose
- PageSize so that the product of PageSize and K is close to 2000. Smaller
- values for PageSize can cause Turbo Access to run measurably slower. Larger
- values for PageSize will use up valuable memory space that could be more
- profitably used by increasing PageStackSize (see below), rather than PageSize.
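-
- The rule of thumb can be turned into a small calculation. The sketch below
- takes K, divides 2000 by it, and then forces the result to be even and within
- the range 4 to 254. Rounding an odd result up (rather than down) is an
- arbitrary choice made for this illustration; the result is a starting point
- for your own timing tests, not a guaranteed optimum.
-
```pascal
program SuggestPageSize;

var
  K, P: Integer;

begin
  write('Maximum key length (K) of the most critical IndexFile: ');
  readln(K);
  if (K < 1) or (K > 255) then
    writeln('K must be between 1 and 255')
  else
  begin
    P := Round(2000.0 / K);        { make PageSize * K close to 2000 }
    if Odd(P) then P := P + 1;     { PageSize must be even }
    if P < 4 then P := 4;          { ... no smaller than 4 }
    if P > 254 then P := 254;      { ... and no larger than 254 }
    writeln('Suggested PageSize = ', P, ', Order = ', P div 2)
  end
end.
```
-
- For K = 25, this suggests a PageSize of 80 and an Order of 40.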
-
-
- Order
- -----
-
- The value of Order is simply half of the value of PageSize. Since PageSize is
- always even, Order will be an integral value. The name "order," in the
- terminology of trees, refers to the number of children each node of the tree
- has. A binary tree is order 2. In B+ trees, the number of children varies,
- but is always at least half of the PageSize. Hence the name Order for half of
- PageSize.
-
-
- PageStackSize
- -------------
-
- PageStackSize is the number of pages that Turbo Access keeps internally, as a
- cache, so that it does not need to read pages from disk as often. The value of
- PageStackSize must be greater than or equal to 3. The Toolbox manual notes
- that the "minimum reasonable value for PageStackSize is the value of MaxHeight"
- (see below). Indeed, empirically, extraordinary performance degradation does
- seem to result if PageStackSize is less than MaxHeight.
-
- Like PageSize, the choice of a value for PageStackSize is a performance/space
- trade-off. Larger values for PageStackSize will allow Turbo Access to run
- faster, but will also use more of the memory that the Turbo compiler sets aside
- for global variables in your program. After PageSize has been chosen as per
- above, choose the largest possible value for PageStackSize that will still
- permit the rest of your program to have the memory for global variables that it
- needs. This may be a trial-and-error process, since large values for
- PageStackSize may cause a "Memory Overflow" error from Turbo while compiling
- global variable declarations in either the Turbo Access source files or in your
- own code.
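-
- Before starting the trial-and-error process, a rough estimate of the memory
- taken by the page cache can narrow the search. The sketch below is a
- back-of-the-envelope figure only. EntryOverhead is an ASSUMPTION about how
- many bytes each key entry needs besides the key string itself (taken here as
- two Integer references, 4 bytes); it also ignores any per-page bookkeeping.
- Read the type declarations in ACCESS.BOX for the real layout, and treat the
- result as a lower bound. The constant values are illustrative.
-
```pascal
program PageStackEstimate;

const
  MaxKeyLen     = 25;
  PageSize      = 80;
  PageStackSize = 10;
  EntryOverhead = 4;    { ASSUMED bytes per key entry besides the key string }

var
  EntryBytes: Integer;
  PageBytes, StackBytes: Real;    { Real, so large products cannot overflow }

begin
  EntryBytes := (MaxKeyLen + 1) + EntryOverhead;   { key string plus overhead }
  PageBytes := PageSize * 1.0 * EntryBytes;        { one cached page }
  StackBytes := PageStackSize * PageBytes;         { the whole page cache }
  writeln('Bytes per key entry: ', EntryBytes);
  writeln('Bytes per page     : ', PageBytes:8:0);
  writeln('Bytes for the cache: ', StackBytes:8:0)
end.
```
-
- With the values shown, the estimate comes to 24000 bytes for the cache.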
-
-
- MaxHeight
- ---------
-
- MaxHeight is the maximum height that the B+ tree in any IndexFile can attain.
- It is a function of the PageSize and the maximum number of keys (including
- duplicates, if permitted) that you will allow in any of the IndexFiles your
- program uses. You can find the correct value for MaxHeight by running the
- following Turbo program, which implements the MaxHeight formula given in the
- Toolbox manual:
-
-    program FindMaxHeight;
-    var
-      PageSize, MaxHeight: Integer;
-      MaxKeyCount: Real;    { Real, so that large key counts can be entered }
-    begin
-      write('PageSize: ');
-      readln(PageSize);
-      write('Maximum number of keys that can be stored in any IndexFile: ');
-      readln(MaxKeyCount);
-      { Logarithm of the key count to the base PageSize / 2, plus one }
-      MaxHeight := Round(Ln(MaxKeyCount) / Ln(PageSize * 0.5)) + 1;
-      writeln('MaxHeight = ', MaxHeight)
-    end.
-
- Increasing MaxHeight by one adds only 4 bytes to each IndexFile variable your
- program maintains. Thus, just in case you underestimate the maximum number of
- keys that will be stored in any IndexFile, you might want to add one or two
- onto the value of MaxHeight computed by the above program.
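-
- To see how slowly MaxHeight grows with the number of keys, the sketch below
- applies the same formula to a few key counts and shows the result with a
- safety margin added. PageSize = 80, Margin = 1, and the three key counts are
- illustrative assumptions, and the procedure name Show is made up here.
-
```pascal
program MaxHeightTable;

const
  PageSize = 80;    { illustrative }
  Margin   = 1;     { extra levels added for safety }

{ Prints the formula result for one key count, with and without the margin. }
procedure Show(MaxKeyCount: Real);
var
  H: Integer;
begin
  H := Round(Ln(MaxKeyCount) / Ln(PageSize * 0.5)) + 1;
  writeln(MaxKeyCount:8:0, ' keys: formula gives ', H,
          ', with margin ', H + Margin)
end;

begin
  Show(1000.0);
  Show(10000.0);
  Show(100000.0)
end.
```
-
- With a PageSize of 80, the formula gives 3 for both 1000 and 10000 keys, and 4
- for 100000 keys, so a margin of one level covers a tenfold underestimate here.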
-
-
- General Notes
- ------- -----
-
- After selecting values for the above constants, compiling your program, and
- creating database files with those constant values in force, it is possible
- that you might want to change those constant values. With MaxDataRecSize,
- MaxKeyLen, PageStackSize, and MaxHeight, you can change the constant (subject
- to the constraints stated in the paragraphs above), recompile your program, and
- still be able to read IndexFiles that were created under the old constant
- values.
-
- After changing PageSize and Order, however, you will no longer be able to read
- IndexFiles created with the old values for PageSize and Order. In this case,
- you will have to rebuild the IndexFiles with the new values for PageSize and
- Order in effect.
-
- To rebuild IndexFiles, you can always read an old DataFile, "d," by doing a
- GetRec(d,i,r) for each data record number "i" from 1 to FileLen(d)-1, and
- bypass the IndexFiles entirely. However, you will need to make sure that you
- have reserved the first two bytes of each data record, so that you can tell
- whether it has been deleted when reading it with GetRec. When rebuilding
- IndexFiles, you will want to skip any data records marked as deleted. See the
- "Reuse of Deleted Data Records" section of the Toolbox manual for details.
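-
- The rebuilding loop described above might be sketched as follows. This is an
- outline under stated assumptions, not a tested utility. GetRec and FileLen
- are used as described in the text; the other Toolbox calls (OpenFile,
- MakeIndex, AddKey, CloseIndex, CloseFile), their parameter orders, and the
- ADDKEY.BOX file name are written from memory of the Toolbox manual and should
- be checked against your copy. The file names, the constant values, and the
- "status" field (with 0 taken to mean "in use") are hypothetical.
-
```pascal
program RebuildIndex;

const
  { Illustrative values; PageSize and Order are the NEW values. }
  MaxDataRecSize = 35;
  MaxKeyLen      = 30;
  PageSize       = 66;
  Order          = 33;
  PageStackSize  = 10;
  MaxHeight      = 4;

{$I ACCESS.BOX}
{$I ADDKEY.BOX}   { assumed name of the file that supplies AddKey }

type
  Person = record
    status: Integer;     { reserved first two bytes; ASSUMED 0 = in use }
    name: String[30];
    age: Byte;
    married: Boolean
  end;

var
  d: DataFile;
  ix: IndexFile;
  r: Person;
  i, recNum: Integer;

begin
  OpenFile(d, 'PERSON.DAT', SizeOf(Person));    { hypothetical file name }
  MakeIndex(ix, 'PERSON.IDX', MaxKeyLen);       { start a fresh, empty index }
  for i := 1 to FileLen(d) - 1 do
  begin
    GetRec(d, i, r);
    if r.status = 0 then          { skip records marked as deleted }
    begin
      recNum := i;                { pass a copy, not the loop variable }
      AddKey(ix, recNum, r.name)
    end
  end;
  CloseIndex(ix);
  CloseFile(d)
end.
```
-
- Run a program like this against a copy of the DataFile first, and compare the
- number of keys added with the number of records you expect to be in use.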
-
- Finally, some notes about the versions of Database Toolbox and their
- compatibility with the versions of Turbo:
- (1) If you are using DOS Turbo 3.0, make sure you have Turbo 3.01A or later
- (version 3.00B does not work with Turbo Access).
- (2) If you are using DOS Turbo 3.0, make sure you use ACCESS3.BOX in place of
- ACCESS.BOX. ACCESS3.BOX is included on the Turbo 3.0 disk (not the
- Database Toolbox disk).
- (3) If you have version 1.0 of the Database Toolbox, see the file TBXFIX in
- DL 1 of the Borland SIG on CompuServe, to upgrade your Toolbox to
- version 1.1 (fixes some bugs).
- If you are not running under DOS, or if you are using Turbo 2.0, only (3) above
- applies to you.
-
- If you have any further questions about Turbo Access, or would like to discuss
- some aspect of it, please feel free to ask on CompuServe's Borland SIG, and/or
- to send a message to me personally on the Borland SIG or via EasyPlex.